Analyzing Attribute Dependencies
Abstract
Many effective and efficient learning algorithms assume independence of attributes. They often perform well even in domains where this assumption is not really true. However, they may fail badly when the degree of attribute dependencies becomes critical. In this paper, we examine methods for detecting deviations from independence. These dependencies give rise to “interactions” between attributes which affect the performance of learning algorithms. We first formally define the degree of interaction between attributes through the deviation of the best possible “voting” classifier from the true relation between the class and the attributes in a domain. Then we propose a practical heuristic for detecting attribute interactions, called interaction gain. We experimentally investigate the suitability of interaction gain for handling attribute interactions in machine learning. We also propose visualization methods for graphical exploration of interactions in a domain.
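The abstract names interaction gain but does not state its formula. As a minimal sketch, assume the definition usually associated with the term: for attributes A and B and class C, the mutual information between the joint attribute (A, B) and C, minus the mutual information each attribute has with C on its own. The function names and the toy XOR data below are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Shannon entropy in bits of the empirical distribution of `samples`."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def interaction_gain(a, b, c):
    """Assumed definition: I(A,B;C) - I(A;C) - I(B;C)."""
    ab = list(zip(a, b))  # treat the attribute pair as one joint attribute
    return (mutual_information(ab, c)
            - mutual_information(a, c)
            - mutual_information(b, c))


# Toy domain: the class is the XOR of two binary attributes.
# Neither attribute alone says anything about the class; together they determine it.
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]
c = [x ^ y for x, y in zip(a, b)]

print(interaction_gain(a, b, c))  # 1.0 (one bit, available only jointly)
```

Under this assumed definition, a positive value means the attributes are more informative about the class together than separately, and a negative value means they carry overlapping information. For example, `interaction_gain(a, a, a)` gives -1.0 for a fully redundant pair.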
Similar resources
Analyzing Direct Non-local Dependencies in Attribute Grammars
Describing the static semantics of programming languages with attribute grammars is eased when the formalism allows direct dependencies to be induced between rules for nodes arbitrarily far away in the tree. Such direct non-local dependencies cannot be analyzed using classical methods, which enable efficient evaluation. This paper presents a new technique for analyzing such dependencies. Attribut...
Dynamic Discovery of Fuzzy Functional Dependencies Using Partitions
A functional dependency describes the relationship between attributes in a database relation. It states that the value of an attribute is uniquely determined by the values of some other attributes. It serves as a constraint between the attributes and is being used in the normalization process of relational database design. Therefore the discovery of functional dependencies from databases has be...
Efficient search for statistically significant dependency rules in binary data
Analyzing statistical dependencies is a fundamental problem in all empirical science. Dependencies help us understand causes and effects, create new scientific theories, and invent cures to problems. Nowadays, large amounts of data are available, but efficient computational tools for analyzing the data are missing. In this research, we rise to the challenge, and develop efficient algorithms for ...
Incorporating record subtyping into a relational data model
Most of the current proposals for new data models support the construction of heterogeneous sets. One of the major challenges for such data models is to provide strong typing in the presence of heterogeneity. Therefore the inclusion of as much information as possible concerning legal structural variants is needed. We argue that the shape of some part of a heterogeneous scheme is often determined...
Attribute dependencies for data with grades I
This paper examines attribute dependencies in data that involve grades, such as a grade to which an object is red or a grade to which two objects are similar. We thus extend the classical agenda by allowing graded, or “fuzzy”, attributes instead of Boolean, yes-or-no attributes in the case of attribute implications, and allowing approximate match based on degrees of similarity instead of exact matc...